Phone-based speech synthesis with neural network and articulatory control
نویسندگان
چکیده
This paper presents a novel method for synthesizing speech signal using a phone-based concatenation approach. Neural network is employed for the generalization of the phone templates during synthesis. Simpli ed articulatory space input parameters based on a modi ed vowel diagram are used to provide exible and e ective articulatory control. It also enables the design of an articulatory control model for allophonic variations in speech signal. The network approach is chosen for its non-linear mapping of the relationship between the articulatory space parameters and the spectral information of speech signal. In addition, non-linear approximation for phone template transitions is facilitated. The phone templates of the synthesizer are implicitly stored as network parameters of a medium size network. The performance of this new speech synthesis technique is demonstrated with a prototype system speci cally designed for Cantonese (a common Chinese dialect) and the synthetic speech quality is assessed by informal listening tests.
منابع مشابه
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
متن کاملArticulatory movement prediction using deep bidirectional long short-term memory based recurrent neural networks and word/phone embeddings
Automatic prediction of articulatory movements from speech or text can be beneficial for many applications such as speech recognition and synthesis. A recent approach has reported stateof-the-art performance in speech-to-articulatory prediction using feed forward neural networks. In this paper, we investigate the feasibility of using bidirectional long short-term memory based recurrent neural n...
متن کاملAn adaptive neural control scheme for articulatory synthesis of CV sequences
Reproducing the smooth vocal tract trajectories is critical for high quality articulatory speech synthesis. This paper presents an adaptive neural control scheme based on fuzzy logic and neural networks. The control scheme estimates motor commands from trajectories of flesh-points on selected articulators. These motor commands are then used to reproduce the trajectories based on 2nd order dynam...
متن کاملRobust articulatory speech synthesis using deep neural networks for BCI applications
Brain-Computer Interfaces (BCIs) usually propose typing strategies to restore communication for paralyzed and aphasic people. A more natural way would be to use speech BCI directly controlling a speech synthesizer. Toward this goal, a prerequisite is the development a synthesizer that should i) produce intelligible speech, ii) run in real time, iii) depend on as few parameters as possible, and ...
متن کاملشبکه عصبی پیچشی با پنجرههای قابل تطبیق برای بازشناسی گفتار
Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
متن کامل